Exploiting Rateless Coding in Structured Overlays to Achieve Persistent Storage. (L'Exploitation de Codes Fontaines pour un Stockage Persistant des Données dans les Réseaux d'Overlay Structurés)

Author

  • Heverson Borba Ribeiro
Abstract

The substantial increase in the amount of information on the Internet has created an extraordinary demand for persistent data storage. Centralized storage architectures are expensive, scale poorly, and are vulnerable to attacks because they represent single points of failure. Over the last few years, peer-to-peer architectures have emerged as an alternative for implementing persistent data storage. Open peer-to-peer systems are fundamentally scalable and cheaper than client-server approaches. However, building persistent storage systems on the peer-to-peer approach requires addressing two fundamental challenges: (a) coping with the transient connectivity of peers, and (b) reducing the impact of misbehaving peers. Replication is a common approach for coping with transient connectivity in peer-to-peer storage systems. However, depending on how frequently peers join and leave the system, replication can be costly in terms of storage overhead and bandwidth consumption. Peer-to-peer overlays that aim to tolerate Byzantine peers usually assume that no more than a bounded fraction of the peers in the system are malicious. However, the proportion of malicious peers in an open peer-to-peer system cannot be estimated reliably. Finding a scalable architecture that provides reliable and persistent data storage while coping with these issues is therefore a worthwhile goal. In this thesis we present the design of Datacube, an efficient and scalable peer-to-peer storage architecture that provides data persistence by implementing a hybrid redundancy scheme on top of a cluster-based structured overlay. The hybrid redundancy scheme proposed by Datacube ensures data persistence and integrity despite the intermittent connection of peers and the presence of adversarial peers. Datacube relies on the properties of the new class of rateless erasure codes to implement its hybrid redundancy scheme.
The analytical evaluations have shown that Datacube performs notably well in terms of availability, storage overhead and bandwidth. Additionally, empirical evaluations have characterized the performance of rateless erasure codes in the context of peer-to-peer storage systems. These evaluations helped us understand how the coding parameters affect the performance of the architecture. To the best of our knowledge, this is the first comprehensive study that helps application designers find the values of the coding parameters that best fit their peer-to-peer context.
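
To make the trade-off between replication and erasure coding concrete, the following is a minimal sketch of the textbook availability comparison, not Datacube's own analytical model. It assumes that each peer is online independently with probability `p`, that a replicated object is available when at least one copy is online, and that a coded object is available when at least `k + extra` of its `n` fragments are online. The function names and the `extra` parameter (a stand-in for the small decoding overhead that rateless codes typically need beyond `k` fragments) are illustrative assumptions.

```python
from math import comb


def replication_availability(p: float, replicas: int) -> float:
    """Probability that at least one of `replicas` full copies is online.

    Assumes each peer is online independently with probability p.
    """
    return 1.0 - (1.0 - p) ** replicas


def coding_availability(p: float, k: int, n: int, extra: int = 0) -> float:
    """Probability that at least k + extra of n stored fragments are online.

    `extra` is a hypothetical allowance for the decoding overhead of a
    rateless code; extra=0 models an ideal (MDS-like) erasure code.
    """
    needed = k + extra
    return sum(
        comb(n, i) * p**i * (1.0 - p) ** (n - i)
        for i in range(needed, n + 1)
    )


if __name__ == "__main__":
    p = 0.5  # assumed peer availability, chosen only for illustration
    # Both configurations store 4x the original data size.
    print("replication, 4 copies:", replication_availability(p, 4))
    print("coding, k=8, n=32:    ", coding_availability(p, 8, 32))
    print("coding, 2 extra needed:", coding_availability(p, 8, 32, extra=2))
```

Under these assumptions, and at the same 4x storage overhead, spreading 32 fragments of which any 8 suffice yields a higher availability than keeping 4 full replicas, which is the usual argument for coding-based redundancy when peers are frequently offline.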

Similar resources

DataCube: a P2P persistent Storage Architecture based on Hybrid Redundancy Schema

This paper presents the design of DataCube, a P2P persistent data platform. This platform exploits the properties of cluster-based peer-to-peer structured overlays together with a hybrid redundancy schema (a compound of light replication and rateless erasure coding) to guarantee durable access and integrity of data despite adversarial attacks. In particular, the recovery of damaged data is ac...

ProFlex: A Probabilistic and Flexible Data Storage Protocol for Heterogeneous Wireless Sensor Networks

This paper presents ProFlex, a proactive data distribution protocol for heterogeneous wireless sensor networks (HWSNs). ProFlex guarantees robustness in data retrieval by intelligently managing data replication among selected storage nodes in the network. Unlike related protocols in the literature, ProFlex considers the resource constraints of sensor nodes and constructs multiple data re...

Analyse multidimensionnelle de documents via des dimensions OLAP

ABSTRACT. With the emergence of semi-structured data formats (such as XML), storing documents in a centralized warehouse arose naturally as an adaptation of data warehouses. Today, OLAP (On-Line Analytical Processing) systems face a growing share of non-numerical data. This article presents an environment for multidimen...

Maintenance de vues XML stockées dans un SGBD relationnel

ABSTRACT. To guarantee access to information in large-scale and/or dynamic environments where data sources are autonomous, important data is often retrieved and stored redundantly in a data warehouse. In the database context, the view mechanism is used to provide an integrated "view" of distr...

Entrepôts de données sur grilles de calcul

Abstract. Data warehouses are used to exploit and analyze large volumes of data extracted from operational information systems. A multidimensional model organizes the data warehouse along several axes of analysis called "dimensions". OLAP (OnLine Analytical Processing) systems allow interactive exploration of the data contained in a ...

Publication date: 2012